Optimizing Schema Languages for XML: Numerical Constraints and Interleaving
Authors
Abstract
The presence of a schema offers many advantages in processing, translating, querying, and storing XML data. Basic decision problems such as equivalence, inclusion, and non-emptiness of intersection of schemas are the building blocks for schema optimization and integration, and for algorithms for static analysis of transformations. It is therefore paramount to establish the exact complexity of these problems. Most common schema languages for XML can be adequately modeled by some kind of grammar with regular expressions on the right-hand sides. In this paper, we observe that apart from the usual regular operators of union, concatenation, and Kleene star, schema languages also allow numerical occurrence constraints and interleaving operators. Although the expressiveness of these operators remains within the regular languages, their presence or absence has a significant impact on the complexity of the basic decision problems. We present a complete overview of the complexity of the basic decision problems for DTDs, XSDs, and Relax NG with regular expressions incorporating numerical occurrence constraints and interleaving. We also discuss chain regular expressions and the complexity of the schema simplification problem incorporating the new operators.
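The claim that these operators stay within the regular languages can be illustrated with a minimal sketch, which is not taken from the paper: a numerical occurrence constraint and an interleaving of single symbols are each rewritten using only union, concatenation, and optionality. The helper names `expand_count` and `expand_interleave` are hypothetical, and the sketch assumes each interleaved operand is a single symbol.

```python
import re
from itertools import permutations
from typing import Sequence


def expand_count(atom: str, lo: int, hi: int) -> str:
    """Rewrite atom{lo,hi} without a counting operator.

    The result is `lo` mandatory copies followed by (hi - lo) nested
    optional copies, so its size grows with the numeric bounds.
    """
    required = f"(?:{atom})" * lo
    optional = ""
    for _ in range(hi - lo):
        optional = f"(?:(?:{atom}){optional})?"
    return required + optional


def expand_interleave(symbols: Sequence[str]) -> str:
    """Rewrite s1 & s2 & ... & sn as a union of all orderings.

    Assumes every operand is a single symbol that occurs exactly once.
    The union has n! alternatives.
    """
    return "|".join("".join(p) for p in permutations(symbols))


if __name__ == "__main__":
    counted = expand_count("a", 2, 4)
    print(bool(re.fullmatch(counted, "aaa")))    # within the bounds 2..4
    print(bool(re.fullmatch(counted, "aaaaa")))  # exceeds the upper bound

    shuffled = expand_interleave(["a", "b", "c"])
    print(len(shuffled.split("|")))              # number of orderings
    print(bool(re.fullmatch(shuffled, "bca")))
```

Both rewritings accept exactly the same words as the extended expression, but the plain expression is much larger: it grows with the numeric bounds in the first case and factorially in the second. This size gap gives some intuition for why the presence of these operators can affect the complexity of decision problems.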
Similar resources
Linear Time Membership for a Class of XML Types with Interleaving and Counting
Regular Expressions (REs) form the basis of most XML type languages, such as DTDs, XML Schema types, and XDuce types (Thompson et al. 2004; Hosoya and Pierce 2003). In this context, the interleaving operator would be a natural addition to the language of REs, as witnessed by the presence of limited forms of interleaving in XSD (the all group), Relax-NG, and SGML. Unfortunately, membership check...
Query Reasoning on Trees with Types, Interleaving, and Counting
A major challenge of query language design is the combination of expressivity with effective static analyses such as query containment. In the setting of XML, documents are seen as finite trees, whose structure may additionally be constrained by type constraints such as those described by an XML schema. We consider the problem of query containment in the presence of type constraints for a class...
The Membership Problem for Regular Expressions with Unordered Concatenation and Numerical Constraints
We study the membership problem for regular expressions extended with operators for unordered concatenation and numerical constraints. The unordered concatenation of a set of regular expressions denotes all sequences consisting of exactly one word denoted by each of the expressions. Numerical constraints are an extension of regular expressions used in many applications, e.g. text search (e.g., ...
DTD++ 2.0: Adding support for co-constraints
In this paper we present an evolution of the DTD++ schema language for XML documents. The original DTD++ language provided support for a large and significant subset of XML Schema while maintaining a syntax closely resembling DTDs: thus the expressive power of XML Schema and the readability of DTDs were both supported in a modular architecture that could rely on a number of validating engine fo...
Translation of Structural Constraints from Conceptual Model for XML to Schematron
Today, XML (eXtensible Markup Language) is a standard for exchange inside and among IT infrastructures. For the exchange to work an XML format must be negotiated between the communicating parties. The format is often expressed as an XML schema. In our previous work, we introduced a conceptual model for XML, which utilizes modeling, evolution and maintenance of a set of XML schemas and allows sc...
Publication date: 2007